A message-passing algorithm for counting short cycles in a graph is presented. For bipartite graphs, which are
of particular interest in coding, the algorithm is capable of counting cycles of length $g, g + 2, \ldots, 2g - 2$,
where $g$ is the girth of the graph. For a general (non-bipartite) graph, cycles of length $g, g + 1, \ldots, 2g - 1$
can be counted. The algorithm is based on performing integer additions and subtractions in the nodes of the graph
and passing extrinsic messages to adjacent nodes. The complexity of the proposed algorithm grows as $O(g|E|^2)$,
where $|E|$ is the number of edges in the graph. For sparse graphs, the proposed algorithm significantly outperforms
the existing algorithms in terms of computational complexity and memory requirements.

Index Terms

Counting cycles in a graph, bipartite graph, girth, short cycles, low-density parity-check (LDPC) codes. 

I. Introduction 

Graphical models are widely used in different branches of science and engineering to represent systems
and facilitate the description of inference algorithms. The structure of the graphs consequently plays an
important role in the dynamics of the system and the performance of the corresponding algorithms. One
important example, which has many applications in areas such as artificial intelligence, signal processing
and digital communications, is the factor graph representation of systems and the sum-product algorithm
[10]. Factor graphs are bipartite graphs and the sum-product algorithm is a generic message-passing
algorithm which operates in a factor graph. One notable application of factor graphs and message-passing
algorithms is in channel coding, where widely popular schemes such as turbo codes [3] and low-density
parity-check (LDPC) codes [7] can be considered as specific instances. In particular, a specific instance
of a factor graph is a Tanner graph [14], which is used to represent an LDPC code. In fact, LDPC
codes, which are famous for their capacity-approaching performance on many communication channels,
owe their popularity to the good performance of the iterative message-passing algorithms that can decode
these codes with relatively low complexity. The low complexity is a consequence of the sparsity of the
Tanner graph.

A preliminary version of this paper was presented at the 2010 IEEE Information Theory Workshop, Cairo, Egypt, Jan. 6 - 8, 2010.

In practical error correction schemes, finite-length codes have to be used. For such codes, the performance
of the message-passing algorithms is closely related to the structure of the graph, in general, and
its cycles, in particular. In [11], the girth distribution of the Tanner graph was related to the performance
of an LDPC code. Numerous publications since have used the cycle structure of the Tanner graph as an
important measure of performance of LDPC codes, with the general belief that for good performance, short
cycles should be avoided in the Tanner graph of the code. In [9], the authors devised a code construction,
known as progressive edge growth (PEG), to maximize the local girth of the graph in a greedy fashion.
Halford and Chugg [8] showed that in addition to the girth, the number and statistics of short cycles
are also important performance metrics of the code. In [15], error rates of finite-length LDPC codes
were accurately and efficiently estimated by enumerating and testing the subsets of short cycles as error
patterns. More recently, Asvadi et al. [2] devised cyclic liftings that improve the error floor performance
of LDPC codes significantly by breaking up the short cycles involved in the dominant trapping sets of the
base code. The close relationship between the performance of graph-based coding schemes and the cycle
structure of the graph, especially the number of short cycles, motivates the search for efficient algorithms
that can count cycles of different lengths in the graph. In the context of coding, the graph is often bipartite.
This includes the Tanner graph of LDPC codes.

Counting the number of cycles in a general graph is known to be a hard problem [6]. Alon et al.
[1] presented methods for counting short cycles in a general graph. The complexity of their algorithm
however is prohibitively high for longer cycles, say beyond 7. Fan and Xiao [5] presented a method
for counting cycles of length $2k$, $2 \le k \le 5$, in the Tanner graph of LDPC codes. The complexity of
their method is $O(m^{k+1})$, where $m$ is the number of the check nodes in the graph. Their method quickly
becomes prohibitively complex even for counting cycles as short as 6, particularly in graphs with large
$m$. An algorithm with similar complexity was proposed in [4] for counting only the shortest cycles of a
Tanner graph. Halford and Chugg [8] presented a method for counting short cycles of length $g$, $g + 2$ and
$g + 4$ in bipartite graphs with girth $g$. The complexity of their method is $O(gn^3)$, where $n$ is the size of
the larger set between the two node partitions.

In this paper, we present an algorithm that counts the cycles of length $g, g + 2, \ldots, 2g - 2$ in a bipartite
graph. The algorithm is based on message-passing on the edges of the graph, where the messages are
computed at the nodes with integer additions and subtractions. The algorithm can also be applied to
general (non-bipartite) graphs to count cycles of length $g, g + 1, \ldots, 2g - 1$. The complexity of the proposed
algorithm is $O(g|E|^2)$, where $|E|$ is the number of edges in the graph. For sparse bipartite graphs, the
proposed algorithm can significantly outperform the algorithm of [8] in terms of both computational
complexity and memory requirements. As an example, for a regular graph with node degrees 3 and 6
corresponding to an (8000,4000) LDPC code, the proposed algorithm is more than 30 times faster than
the method of [8] and requires less memory by a factor of about 600. Conceptually also, the proposed
algorithm is much simpler than the algorithm of [8], in which tedious matrix equations are involved in
the counting process. Noteworthy is also the fact that for graphs with $g \ge 6$, the proposed algorithm is
capable of counting short cycles of lengths up to at least the same value as the algorithm of [8] does.

The remainder of this paper is organized as follows. Basic definitions and notations are provided 
in Section II. In Section III, we develop the proposed algorithm and give a simple example. In our 
presentation, we use bipartite graphs for the sake of simplicity and for the reason that the graphs involved 
in most coding applications are bipartite. The pseudo code for the algorithm is presented in Section IV. 
Discussions on complexity and memory requirements and comparisons with the algorithm of [8] will
follow in Section V. Section VI contains numerical results. Section VII concludes the paper. 

II. Definitions and Notations 

An undirected graph $G = (V, E)$ is defined as a set of nodes $V$ and a set of edges $E$, where $E$ is some
subset of the pairs $\{\{u, v\} : u, v \in V, u \ne v\}$. In this definition and without loss of generality in the context
of this paper, we exclude loops using the condition $u \ne v$. Parallel edges are also indistinguishable by this
definition and are excluded for simplicity. A walk of length $k$ in $G$ is a sequence of nodes $v_1, v_2, \ldots, v_{k+1}$
in $V$ such that $\{v_i, v_{i+1}\} \in E$ for all $i \in \{1, \ldots, k\}$. Equivalently, a walk of length $k$ can be described
by the corresponding sequence of $k$ edges. A walk is a path if all the nodes $v_1, v_2, \ldots, v_k$ are distinct.
A walk is called closed if the two end nodes are identical, i.e., if $v_1 = v_{k+1}$ in the previous description.
A cycle of length $k$ is a closed path of length $k$. In a graph $G$, cycles of length $k$, also referred to as
$k$-cycles, are denoted by $C_k$. We use $N_k$ for $|C_k|$. A closed walk is referred to as lollipop-style if no two
consecutive edges of the walk are identical. By definition, a lollipop-style closed walk contains at least
one cycle. To each (undirected) walk (cycle), we associate two directed walks (cycles), depending on
which end node or edge is selected as the starting point. This concept is important in the description of
the proposed algorithm since the direction of edges is of consequence in message-passing algorithms.

A graph $G = (V, E)$ is called bipartite if the set $V$ can be partitioned into two disjoint subsets $U$ and $W$
($V = U \cup W$ and $U \cap W = \emptyset$) such that every edge in $E$ connects a node from $U$ to a node from $W$.
We denote $|U|$ by $n$ and $|W|$ by $m$. Tanner graphs of LDPC codes are bipartite graphs, in which $U$ and
$W$ are referred to as variable nodes and check nodes, respectively. Parameters $n$ and $m$ in this case are
the code block length and the number of parity check equations, respectively.

The girth $g$ of a graph is the length of a shortest cycle in the graph. For bipartite graphs, all cycles
have even lengths and $g$ is an even number. The number of edges connected to a node $v$ is called the
degree of $v$, and is denoted by $d_v$. We call a bipartite graph $G = (U \cup W, E)$ regular if all the nodes in
$U$ have the same degree $d_u$ and all the nodes in $W$ have the same degree $d_w$. Otherwise, the graph is
called irregular. For a regular graph, it is easy to see that $n d_u = m d_w = |E|$.
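The definitions above map directly onto a small data structure. The following sketch is an illustration only: it stores a simple undirected graph as an adjacency dictionary and finds the girth by breadth-first search from every node. The names `complete_bipartite` and `girth`, and the test graph $K_{3,3}$, are assumptions made for this example; this is a textbook search, not the girth algorithm of [11].

```python
from collections import deque

def complete_bipartite(n, m):
    """Adjacency dict of K_{n,m} with U = ('u', i) and W = ('w', j); a hypothetical test graph."""
    U = [("u", i) for i in range(n)]
    W = [("w", j) for j in range(m)]
    adj = {u: list(W) for u in U}
    adj.update({w: list(U) for w in W})
    return adj

def girth(adj):
    """Length of a shortest cycle of a simple undirected graph, or None if it is acyclic."""
    best = None
    for s in adj:                       # breadth-first search from every node
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in dist:       # tree edge
                    dist[b], parent[b] = dist[a] + 1, a
                    queue.append(b)
                elif parent[a] != b:    # non-tree edge: some cycle has length <= cand
                    cand = dist[a] + dist[b] + 1
                    if best is None or cand < best:
                        best = cand
    return best

adj = complete_bipartite(3, 3)
n_edges = sum(len(nb) for nb in adj.values()) // 2
print(girth(adj), n_edges)   # K_{3,3}: girth 4, and n*d_u = m*d_w = |E| = 9
```

The minimum over all starting nodes is exact because a search started on a shortest cycle finds that cycle's length, while no candidate can be shorter than the girth.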

III. Main Ideas

A. Message Passing 

A message-passing algorithm operates in a graph by computing messages at the nodes and passing them
along the edges to the adjacent nodes. A well-known example is the sum-product algorithm operating in
a factor graph [10]. Message-passing algorithms often have the property that a message sent along an
edge $e$ is not a function of the message previously received along $e$. We refer to this property as extrinsic
message-passing. An example is shown in Fig. 1, where the operation at node $v_1$ is multiplication. Extrinsic
message-passing, for example, is known to be an important property of good iterative decoders [13]. The
algorithm proposed in this paper also has this property.

For bipartite graphs $G = (U \cup W, E)$, a natural message-passing schedule is for every node in $U$ to send
messages to adjacent nodes in $W$ followed by every node in $W$ to send messages to adjacent nodes
in $U$. This is referred to as parallel schedule and is used often in iterative decoding algorithms. In this
case, a complete cycle of message-passing from $U$ to $W$ and then from $W$ to $U$ is called one iteration.
We assign discrete time $t$ to message-passing, starting from time index zero followed by positive integer
values. Corresponding to a time index $t \ge 0$, we associate an iteration number $\ell = \lfloor t/2 \rfloor + 1 \ge 1$. The
time indices $t = 2\ell - 2$ and $t = 2\ell - 1$ correspond to the first and the second halves of the iteration $\ell$.
We also refer to messages passed at $t = 0$ as initial messages, and use the notation $m^{(\ell)}_{u \to w}$ for a message
passed from node $u$ to node $w$ at iteration $\ell$. The notations $m^{(\ell)}_{e \to u}$ and $m^{(\ell)}_{u \to e}$ are used for the incoming and
the outgoing messages to and from node $u$ along edge $e$ at iteration $\ell$, respectively.

Fig. 1. An extrinsic message-passing algorithm: a) messages received by $v_1$ at $t$, b) messages sent by $v_1$ at $t + 1$.

Fig. 2. Message passing for a cycle of length $2k$. a) initial message $X$ is passed along $e_1$, b) after $k$ iterations, $v_1$ receives $X$ along $e_2$.

In the general context of iterative decoding, all nodes in the same partition ($U$ or $W$) perform the
same type of operation to generate their messages. The types of operation however are usually different
for the two partitions and depend on the nature of the algorithm and the domain in which the messages
are presented. In the algorithm developed in this paper, however, all the nodes perform the same type of
operation. The messages are all monomials and the operation is multiplication. An example can be seen
in Fig. 1. In this work, a monomial is the product of integer powers of variables. For example, a message
$m = X_1^i X_2^j X_3^k$ is a monomial with variables $X_1$, $X_2$ and $X_3$. We say $m$ contains $i$ copies of $X_1$, $j$ copies
of $X_2$ and $k$ copies of $X_3$. If the variables are ordered, we may use a simpler representation of $m$ as
a vector: $m = (i, j, k)$. Using the vector representation of messages, the multiplication of monomials is
reduced to the addition of the corresponding vectors.
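The vector representation and the extrinsic rule can be sketched in a few lines. This is a minimal illustration under the stated conventions; the function names `multiply` and `extrinsic_messages` are hypothetical and do not appear in the paper.

```python
def multiply(monomials, n_vars):
    """Product of monomials given as exponent vectors: add the vectors component-wise."""
    total = [0] * n_vars
    for m in monomials:
        for i, power in enumerate(m):
            total[i] += power
    return tuple(total)

def extrinsic_messages(incoming, n_vars):
    """Outgoing message on each edge: product of the incoming messages on all other edges.

    Computed as (sum of all incoming vectors) minus (incoming vector on that edge).
    """
    total = multiply(incoming, n_vars)
    return [tuple(t - x for t, x in zip(total, m)) for m in incoming]

# A degree-3 node receiving X1, X2 and 1 (the all-zero vector) on its three edges.
incoming = [(1, 0, 0), (0, 1, 0), (0, 0, 0)]
print(extrinsic_messages(incoming, 3))
# [(0, 1, 0), (1, 0, 0), (1, 1, 0)], i.e. X2, X1 and X1*X2
```

Summing once and subtracting per edge is what keeps the node operation to integer additions and subtractions only.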
B. Algorithm Development

Consider an extrinsic message-passing algorithm in a graph with messages as monomials and node
operations as monomial multiplication. In the following, we explain how such an algorithm can count
short cycles of the graph. Consider a cycle $C$ of length $2k$ as depicted in Fig. 2(a). Suppose that node
$v_1$ of $C$ passes the monomial $X$ as the initial message at $t = 0$ to $v_2$. Due to the extrinsic property of
message-passing, $X$ will be passed to $v_3$ from $v_2$ at $t = 1$ and continues its journey around the cycle,
one node at a time, until it reaches back to $v_1$ at $t = 2k - 1$ and at the end of iteration $k$, as shown in
Fig. 2(b). Clearly, if node $v_1$ had also passed a monomial $Y$ along the edge $e_2$ to $v_{2k}$ at $t = 0$, it would
have also received $Y$ from $v_2$ along $e_1$ at the end of iteration $k$. So the iteration number at which node
$v_1$ receives back the messages it passed at the first iteration is half the length of the cycle. The following
lemma puts this basic idea in the context of the message-passing in a general graph.

Lemma 1: Suppose that $C$ is a cycle of length $2k$ in a bipartite graph $G = (V, E)$, and $v \in V$ is in
$C$. Denote the two adjacent edges of $v$ in $C$ by $e_1$ and $e_2$. Assume that the message-passing algorithm
is initiated on the side of the graph which includes $v$ by passing 1 along every edge in $E$, except $e_1$ and
$e_2$. For $e_1$ and $e_2$, the initial messages are monomials $X_1$ and $X_2$, respectively. Then, at iteration $k$, node
$v$ will receive one copy of $X_2$ and one copy of $X_1$ along $e_1$ and $e_2$, respectively, where both copies have
traveled through all the edges of $C$.

Proof: The proof is straightforward and follows directly from the definition of extrinsic
message-passing. ■

It is easy to see that if the node $v$ in Lemma 1 is in $N_{2k}^{e_1,e_2}$ cycles of length $2k$ which all include $e_1$
and $e_2$, then at iteration $k$, node $v$ will receive $N_{2k}^{e_1,e_2}$ copies of $X_2$ and $N_{2k}^{e_1,e_2}$ copies of $X_1$ along $e_1$
and $e_2$, respectively, where each pair of copies has traveled through all the edges of one of the cycles.
Assuming there are no additional copies of $X_2$ received by $v$ along $e_1$ and no additional
copies of $X_1$ received by $v$ along $e_2$ at iteration $k$, the monomials received at iteration $k$ by $v$ along $e_1$
and $e_2$ are respectively $X_2^{N_{2k}^{e_1,e_2}}$ and $X_1^{N_{2k}^{e_1,e_2}}$.

We note that in addition to copies of $X_2$ which are received by node $v$ along $e_1$ at iteration $k$, $v$ may
also receive copies of $X_1$ along $e_1$ at iteration $k$. These correspond to closed walks of length $2k$ which
start and end at edge $e_1$, and are clearly not cycles. To eliminate these structures in the counting process
of $N_{2k}^{e_1,e_2}$, one should consider the power of received variables along $e_1$ and $e_2$ excluding the initial
message. To describe this, we use the notation $m_{E,e_1}^{(k)}$ to denote the incoming message to node $v$ along $e_1$
at iteration $k$, excluding the variable of the initial message passed by $v$ along $e_1$. In the above scenario,
we have $m_{E,e_1}^{(k)} = X_2^{N_{2k}^{e_1,e_2}}$ and $m_{E,e_2}^{(k)} = X_1^{N_{2k}^{e_1,e_2}}$. This results in

\[ N_{2k}^{e_1,e_2} = \frac{\mathrm{ex}\big(m_{E,e_1}^{(k)}\big) + \mathrm{ex}\big(m_{E,e_2}^{(k)}\big)}{2}, \tag{1} \]

where $\mathrm{ex}(\cdot)$ is the exponent of the monomial, defined as the sum of the powers of all its variables.

Fig. 3. Three problematic structures for which the incoming extrinsic messages do not represent cycles.

There is also a possibility that node $v$ receives additional copies of $X_2$ along $e_1$ and additional copies of
$X_1$ along $e_2$ at iteration $k$. These additional copies travel either through the same cycle multiple times or
through non-cycle lollipop-style closed walks of length $2k$ which start and end at $e_1$ and $e_2$, respectively.
Examples of the latter structures are given in Fig. 3, where the message $X$ is initiated at node $v_1$. In
Fig. 3(a), $2k$ is in fact the sum of the lengths of the two cycles $C_1$ and $C_2$, while in Fig. 3(b), it is the
sum of the lengths of the two cycles plus twice the length of the path between $v_i$ and $v_j$. In Fig. 3(c),
message $X$ travels from $v_1$ to $v_i$ first, and then from $v_i$ to $v_j$ through $v_k$. It then travels back from $v_j$ to
$v_i$ through $v_l$ followed by a trip from $v_i$ to $v_j$ through $v_k$ for the second time. The journey finally ends
when $X$ is passed back from $v_j$ to $v_1$. In this case, the total length of the walk is $2k$.

A careful inspection of the problematic structures, as described above, reveals that they all include at 
least two cycles. This implies that the shortest length of such structures is 2g, where g is the girth of the 
graph. We thus have the following: 

Lemma 2: Consider a bipartite graph $G = (V, E)$ with girth $g$. Select a node $v \in V$ with two adjacent
edges $e_1$ and $e_2$. Assume that the message-passing algorithm is initiated at $t = 0$ by passing 1 along
every edge in $E$, except $e_1$ and $e_2$. For $e_1$ and $e_2$ the initial messages are set to monomials $X_1$ and $X_2$,
respectively. Then, at iteration $k$, $k < g/2$, node $v$ will only receive 1 along all its edges including $e_1$ and
$e_2$. At iteration $k$, $g/2 \le k \le g - 1$, node $v$ will receive monomials $X_1^{i} X_2^{N_{2k}^{e_1,e_2}}$ and $X_1^{N_{2k}^{e_1,e_2}} X_2^{j}$ along $e_1$
and $e_2$, respectively, where $i$ and $j$ are non-negative integers. Equation (1) is thus valid for $k \le g - 1$.

Proof: Node $v$ will receive messages other than 1 only if a copy of $X_1$ or $X_2$ is passed back to it.
Due to the extrinsic nature of message-passing, such a copy must travel through a lollipop-style closed
walk with both ends at $v$. Since the length of a lollipop-style closed walk is at least $g$, no messages other
than 1 will be received by $v$ at iterations $k$, $k < g/2$. At iterations $k$, $g/2 \le k \le g - 1$, node $v$ can
receive copies of $X_1$ and $X_2$ that have traveled through lollipop-style closed walks with both ends at $v$.
In particular, the number of copies of $X_1$ and $X_2$ that $v$ receives at iteration $k \ge g/2$, along $e_2$ and $e_1$,
respectively, is equal to the number of lollipop-style closed walks of length $2k$ that start and end at $e_1$
and $e_2$. For $k$ in the range $g/2 \le k \le g - 1$, such lollipop-style closed walks are limited to cycles of
length $2k$ that include $e_1$ and $e_2$. (For $k \ge g$, in addition to cycles, they can include multiple trips over
the same cycle or cases such as those in Fig. 3.) ■
Let us now focus on the problem of counting all the cycles which pass through a certain node $v$ in a
bipartite graph $G = (U \cup W, E)$. Without loss of generality, we assume $v \in U$. One approach to count all
the cycles containing $v$ is to use Lemma 2 and count the cycles involving different adjacent edges, two at
a time, and then add up the results for any cycle length. The following lemma however suggests a more
efficient approach.

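Lemma 3: Consider a bipartite graph $G = (U \cup W, E)$ with girth $g$, and a node $v \in U$. Initiate the
message-passing algorithm by passing 1 on all the edges connected to nodes $u \in U$, $u \ne v$, while passing
different monomials, say $X_1, X_2, \ldots, X_{d_v}$, along the edges connected to $v$: $e_1, \ldots, e_{d_v}$, respectively.
For $k \le g - 1$, we then have

\[ N_{2k}^{v} = \frac{1}{2} \sum_{j=1}^{d_v} \mathrm{ex}\big(m_{E,e_j}^{(k)}\big), \tag{2} \]

where $N_{2k}^{v}$ is the number of $2k$-cycles containing $v$.

Proof: At iteration $k \le g - 1$, consider the message received by $v$ along $e_j$, $j = 1, \ldots, d_v$, excluding
the variable $X_j$. In this extrinsic message $m_{E,e_j}^{(k)}$, the power of variable $X_i$, $i \ne j$, is $N_{2k}^{e_i,e_j}$. We therefore
have

\[ \mathrm{ex}\big(m_{E,e_j}^{(k)}\big) = \sum_{i=1,\, i \ne j}^{d_v} N_{2k}^{e_i,e_j}. \]

This combined with

\[ N_{2k}^{v} = \frac{1}{2} \sum_{j=1}^{d_v} \sum_{i=1,\, i \ne j}^{d_v} N_{2k}^{e_i,e_j} \]

completes the proof. ■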
Fig. 5. Message passing of the proposed algorithm for three iterations in the graph of Fig. 4.

In Lemma 3, at iteration $k$, $k < g/2$, node $v$ will only receive 1 along all its edges, indicating there are
no cycles of length $g - 2$ or smaller containing $v$.

It is worth noting that the message-passing algorithm can be simplified by allowing node v to always 
pass 1 after the first iteration. This is demonstrated in the following example. 
C. A Simple Example

Here, we illustrate the proposed method by a simple example. Consider the bipartite graph $G$ shown in
Fig. 4(a), where the nodes in $U$ and $W$ are represented by hollow and full circles, respectively. Suppose
that we are interested in counting short cycles containing node $u_1$. For the simplicity of presentation, as
shown in Fig. 4(b), we can unwind the graph $G$ from node $u_1$. It is easy to see from Fig. 4(b) that the
girth of $G$ is 4. Using the proposed method, we can thus count cycles of length up to $2g - 2 = 6$. The
message-passing algorithm is illustrated in Figs. 5(a)-(f):

(a) At $t = 0$, the algorithm is initiated by node $u_1$ passing messages $X_1$, $X_2$, and $X_3$ along its 3 edges. All
the other messages sent by nodes $u_2$, $u_3$ and $u_4$ along their edges are equal to 1, and not shown. [Equivalently,
in the vector representation, the initial messages of node $u_1$ are vectors $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$,
while all the other messages are $(0, 0, 0)$.]

(b) At $t = 1$, only the nodes in $W$ are active. The corresponding (non-one) messages are shown in Fig.
5(b). Note that in this iteration ($\ell = 1$), all the incoming messages to node $u_1$ are equal to one.

(c) At $t = 2$, nodes in $U$ are active. They all pass extrinsic messages using multiplication. For example,
the message from $u_3$ to $w_4$ is the product of the two messages that $u_3$ received along its other edges,
$X_1 \times X_2 = X_1 X_2$. [In the vector representation, $m_{u_3 \to w_4} = (1, 0, 0) + (0, 1, 0) = (1, 1, 0)$.]

(d) At $t = 3$ ($\ell = 2$), for the first time node $u_1$ receives non-one messages, an indication that there is
at least one cycle of length $2\ell = 4$ containing $u_1$. Using (2), we obtain $N_4^{u_1} = (1 + 2 + 1)/2 = 2$.

(e) At $t = 4$, the nodes in $U$ are active and pass messages.

(f) At $t = 5$ ($\ell = 3$), nodes in $W$ are active. Again in this iteration, node $u_1$ receives non-one messages,
an indication that it belongs to at least one 6-cycle. Using (2), we have $N_6^{u_1} = (2 + 1 + 1)/2 = 2$.
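The per-node counting of Lemma 3 that this example walks through can be sketched as follows. Several points are assumptions made for the illustration: the function name `cycles_through_node` is hypothetical; the graph of Fig. 4 is not reproduced here, so the complete bipartite graph $K_{3,3}$ (girth 4) is used as a stand-in; and all directed edges are updated at every time step rather than alternating between $U$ and $W$, which leaves the messages arriving at the chosen node unchanged in a bipartite graph because the unused half of the messages stays equal to 1.

```python
def cycles_through_node(adj, v, max_len):
    """Return {L: number of L-cycles containing v} for the lengths L <= max_len that occur.

    Assumption: max_len <= 2*girth - 2 for a bipartite graph (2*girth - 1 in general);
    beyond that range non-cycle closed walks would also be counted.
    """
    nbrs = list(adj[v])
    d = len(nbrs)
    one = (0,) * d                                  # the monomial 1
    msg = {(a, b): one for a in adj for b in adj[a]}
    for j, w in enumerate(nbrs):                    # v sends X_j along its j-th edge
        msg[(v, w)] = tuple(int(i == j) for i in range(d))
    counts = {}
    for t in range(1, max_len):                     # a cycle of length t+1 returns at time t
        new = {}
        for a in adj:
            total = [sum(msg[(c, a)][i] for c in adj[a]) for i in range(d)]
            for b in adj[a]:
                if a == v:
                    new[(a, b)] = one               # v always passes 1 after t = 0
                else:                               # extrinsic: leave out what b itself sent
                    new[(a, b)] = tuple(total[i] - msg[(b, a)][i] for i in range(d))
        msg = new
        # exponent of each incoming message at v, excluding the variable sent on that edge
        received = sum(msg[(w, v)][i]
                       for j, w in enumerate(nbrs) for i in range(d) if i != j)
        if received:
            counts[t + 1] = received // 2           # Eq. (2)
    return counts

# Hypothetical test graph K_{3,3} (girth 4), not the graph of Fig. 4.
U, W = ["u0", "u1", "u2"], ["w0", "w1", "w2"]
adj = {**{u: list(W) for u in U}, **{w: list(U) for w in W}}
print(cycles_through_node(adj, "u0", 6))            # {4: 6, 6: 6}
```

In $K_{3,3}$ every node lies on six 4-cycles and on all six Hamiltonian 6-cycles, which matches the printed result.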

IV. Proposed Algorithm 

A. Pseudo Code 

To count the short cycles of a certain length $2k$ in the whole graph $G = (U \cup W, E)$, one can apply
the proposed algorithm described in the previous section to every node in one of the node partitions, $U$
or $W$, and then add up the results for each cycle length. In this case, for each cycle length, the result
should be divided by $k$ as every cycle is counted $k$ times:

\[ N_{2k} = \Big(\sum_{u \in U} N_{2k}^{u}\Big)/k = \Big(\sum_{w \in W} N_{2k}^{w}\Big)/k, \qquad \frac{g}{2} \le k \le g - 1. \tag{3} \]
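As a quick sanity check of the division by $k$, consider the complete bipartite graph $K_{3,3}$, which is an assumed example and not one of the graphs studied in this paper. It has girth 4, and each of the three nodes in $U$ lies on six 4-cycles and on six 6-cycles, so summing over $U$ gives

```latex
N_4 = \frac{1}{2}\sum_{u \in U} N_4^{u} = \frac{6 + 6 + 6}{2} = 9,
\qquad
N_6 = \frac{1}{3}\sum_{u \in U} N_6^{u} = \frac{6 + 6 + 6}{3} = 6 .
```

Both values agree with direct counting: $\binom{3}{2}^2 = 9$ four-cycles and $3!\,2!/2 = 6$ Hamiltonian cycles.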
To simplify the algorithm and to avoid the $k$-fold counting repetition, we can deactivate a node as soon
as its cycles are counted. This would be equivalent to removing the node and all its adjacent edges from
the graph. Moreover, the algorithm can be further simplified by only activating nodes that have at least
one non-one incoming message. Based on these simplifications, the proposed algorithm has the pseudo
code provided in Algorithm 1.

Algorithm 1 is initiated from $U$. Similarly, it can be initiated from $W$. Nodes in $U$ are indexed by
$i = 1, \ldots, n$, and notation $m_{E(w_j \to u_i)}$ is used to denote the incoming message from node $w_j$ to node $u_i$
excluding the initial variable passed from $u_i$ to $w_j$. Notation $N(u)$ is used for the nodes adjacent to $u$
(neighbors of $u$).

Here we have implicitly assumed that the girth $g$ of the graph is known. In the following subsection,
we discuss a modification of the algorithm that can compute $g$ and $N_g$.
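The sequential procedure with node deactivation can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' implementation: the name `count_short_cycles` is hypothetical, the girth `g` is supplied by the caller, every directed edge is updated at each time step instead of activating only nodes with non-one inputs, and $K_{3,3}$ serves as an assumed test graph.

```python
def count_short_cycles(adj, U, g):
    """Count cycles of length g, g+2, ..., 2g-2 in a bipartite graph with girth g.

    adj: adjacency dict of the whole graph; U: nodes of the initiating partition.
    Nodes of U are processed one at a time and removed afterwards, so every
    cycle is counted exactly once and no division by k is needed.
    """
    adj = {a: list(nb) for a, nb in adj.items()}        # private copy; it is pruned below
    counter = {2 * k: 0 for k in range(g // 2, g)}
    for u in U:
        nbrs = adj[u]
        d = len(nbrs)
        one = (0,) * d                                  # the monomial 1
        msg = {(a, b): one for a in adj for b in adj[a]}
        for j, w in enumerate(nbrs):                    # initial messages X_1, ..., X_d
            msg[(u, w)] = tuple(int(i == j) for i in range(d))
        for t in range(1, 2 * g - 2):                   # times 1 .. 2g-3
            new = {}
            for a in adj:
                total = [sum(msg[(c, a)][i] for c in adj[a]) for i in range(d)]
                for b in adj[a]:
                    new[(a, b)] = one if a == u else tuple(
                        total[i] - msg[(b, a)][i] for i in range(d))
            msg = new
            if t % 2 == 1 and t + 1 >= g:               # u receives at odd times only
                local = sum(msg[(w, u)][i]
                            for j, w in enumerate(nbrs) for i in range(d) if i != j)
                counter[t + 1] += local // 2
        for w in nbrs:                                  # deactivate u: drop it from the graph
            adj[w].remove(u)
        del adj[u]
    return counter

# Hypothetical test graph K_{3,3} with girth 4.
U, W = ["u0", "u1", "u2"], ["w0", "w1", "w2"]
adj = {**{u: list(W) for u in U}, **{w: list(U) for w in W}}
print(count_short_cycles(adj, U, 4))                    # {4: 9, 6: 6}
```

Removing a processed node can only increase the girth of what remains, so the range of valid cycle lengths is preserved for the nodes processed later.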

B. Parallel Implementation 

The algorithm presented in the previous subsection is based on sequentially going through the nodes 
in one of the two partitions in the graph. To speed up the counting process and at the expense of larger 
memory usage, one can run a parallel version of the algorithm in which all the nodes in one partition are 
initialized simultaneously. This is explained in Fig. 6(a) for the graph of Fig. 4.

The parallel implementation, just described, can also be used to compute $g$ and $N_g$. To see this, note
that in the parallel implementation, none of the nodes in the initiating partition will receive a copy of
its initial messages before iteration $g/2$. At iteration $g/2$, the nodes which are contained in the shortest
cycles will receive copies of their initial messages and all such copies are received along the edges whose
initial messages differ from the received messages. This means that all the received copies represent true
$g$-cycles. Therefore to compute $g$ and $N_g$, one does not need to distinguish among the initial messages of
a node. The initialization in this case is explained in Fig. 6(b) for the graph of Fig. 4. In this setup, if the
first iteration in which at least one of the nodes receives a non-one message is iteration $k$, then $g = 2k$,
and the number of $g$-cycles is equal to the total number of received non-one messages by all the nodes
divided by $2k$.
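A minimal sketch of this girth computation follows. It rests on an interpretation that should be read as an assumption: each node $u \in U$ is given a single variable $X_u$ that it sends on all of its edges, and a "received non-one message" is taken to mean a returned copy of the node's own variable. The function name `girth_and_shortest_cycles` and the test graph are hypothetical.

```python
def girth_and_shortest_cycles(adj, U):
    """Return (g, N_g) for a bipartite graph, or (None, 0) if it has no cycle.

    Every node u in U sends its own variable X_u on all of its edges at t = 0;
    the first time any node gets a copy of its own variable back fixes the girth.
    """
    index = {u: i for i, u in enumerate(U)}
    n = len(U)
    one = (0,) * n                                # the monomial 1
    msg = {(a, b): one for a in adj for b in adj[a]}
    for u in U:
        unit = tuple(int(i == index[u]) for i in range(n))
        for w in adj[u]:
            msg[(u, w)] = unit
    for t in range(1, len(adj)):                  # girth cannot exceed the number of nodes
        new = {}
        for a in adj:
            total = [sum(msg[(c, a)][i] for c in adj[a]) for i in range(n)]
            for b in adj[a]:
                new[(a, b)] = tuple(total[i] - msg[(b, a)][i] for i in range(n))
        msg = new
        returned = sum(msg[(w, u)][index[u]] for u in U for w in adj[u])
        if returned:
            g = t + 1                             # = 2k, with k the current iteration
            return g, returned // g               # each g-cycle returns 2 copies to k nodes
    return None, 0

# Hypothetical test graph K_{3,3}: girth 4 with nine 4-cycles.
U, W = ["u0", "u1", "u2"], ["w0", "w1", "w2"]
adj = {**{u: list(W) for u in U}, **{w: list(U) for w in W}}
print(girth_and_shortest_cycles(adj, U))          # (4, 9)
```

The message vectors here have length $n$ rather than $d_u$, which reflects the larger memory usage of the parallel version noted above.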

Algorithm 1 Proposed Message-Passing Algorithm for Counting Short Cycles
for $k = 1 : g - 1$ do
    counter($k$) $= 0$
end for
for $i = 1 : n$ do
    Initialization
    $l = 1$
    for $w_j \in N(u_i)$ do
        $m^{(0)}_{u_i \to w_j} = X_l$
        $l = l + 1$
    end for
    for $i' = i + 1 : n$ do
        for $w_j \in N(u_{i'})$ do
            $m^{(0)}_{u_{i'} \to w_j} = 1$
        end for
    end for
    for $k = 1 : g - 1$ do
        Message Passing from $W$
        for $j = 1 : m$ do
            for $u_{i'} \in N(w_j)$ do
                $m^{(2k-1)}_{w_j \to u_{i'}} = \prod_{u_h \in N(w_j),\, h \ge i,\, h \ne i'} m^{(2k-2)}_{u_h \to w_j}$
            end for
        end for
        Counting Cycles
        local-counter($k$) $= \sum_{w_j \in N(u_i)} \mathrm{ex}\big(m_{E(w_j \to u_i)}\big)$
        Message Passing from $U$
        for $i' = i + 1 : n$ do
            for $w_j \in N(u_{i'})$ do
                $m^{(2k)}_{u_{i'} \to w_j} = \prod_{w_h \in N(u_{i'}),\, h \ne j} m^{(2k-1)}_{w_h \to u_{i'}}$
            end for
        end for
    end for
    for $k = 1 : g - 1$ do
        counter($k$) $=$ counter($k$) $+$ local-counter($k$)$/2$
    end for
end for

V. Complexity of the Proposed Algorithm

A. Computational Complexity

In the following, we arbitrarily assume that the algorithm is initiated from the node set $U$. We consider
a sequential implementation, where the nodes in $U$ are processed one at a time. We also consider the
vector representation of messages and first derive the complexity for a regular graph. We then generalize
the results to irregular graphs. For a regular graph $G = (U \cup W, E)$, starting from a node $u \in U$, there are
$d_u$ initial messages, each represented by a unit vector of length $d_u$. All the subsequent messages are also
vectors of length $d_u$. To calculate the messages at an active node $w \in W$, we first add all the incoming
vectors to $w$, and then subtract from this the incoming message along each adjacent edge to obtain the
outgoing message along that edge. This requires $(2d_w - 1)d_u$ integer additions and subtractions. Similarly,
for each active node $u \in U$, we need $(2d_u - 1)d_u$ integer additions and subtractions to obtain the outgoing
messages. Considering that in even and odd time instances, the number of active nodes is upper bounded
by $n$ and $m$, respectively, the number of operations per iteration is $O(n d_u^2 + m d_u d_w) = O(|E| d_u)$. Since
the algorithm needs to perform $g - 1$ iterations, the complexity of the algorithm for each node $u \in U$ is
$O(g n d_u^2 + g m d_u d_w) = O(g|E| d_u)$. The total complexity is thus

\[ O(g n^2 d_u^2 + g n m d_u d_w) = O(g n^2 d_u^2) = O(g|E|^2). \]

It is easy to see that the same complexity order also applies to irregular bipartite graphs. 

In the above discussions, it is implicitly assumed that the girth of the graph is known a priori. Since
the computational complexity of finding the girth is at most $O(n^2)$, e.g., based on the algorithm of [11],¹
the extra complexity for computing the girth is negligible compared to the rest of the computations.

B. Memory Requirements 

For each edge of the bipartite graph, we need two memory locations to store the message vectors in
both directions. For a regular graph, since each vector has $d_u$ elements, the total number of memory
locations, each storing an integer number, is $2d_u|E|$ or $O(n d_u^2) = O(d_u|E|)$. For an irregular graph, the
storage complexity is $O(d_{\max}|E|)$, where $d_{\max}$ is the maximum node degree in $U$ or $W$, depending on
which side initiates the algorithm.

¹ It is easy to see that if we use the algorithm proposed in Section IV.B to compute $g$, the complexity is $O(g n^2 d_u)$, which is in general
larger than that of [11]. The algorithm of [11] however only finds $g$, while the proposed algorithm also computes $N_g$.
TABLE I

Number of Short Cycles in the Tanner Graphs of Four Rate-1/2 LDPC Codes 
C. Comparison with the Algorithm of [8]

First, it is important to note that while the algorithm of [8] is limited to bipartite graphs, the proposed
algorithm is capable of counting short cycles in a general (non-bipartite) graph. For bipartite graphs, the
algorithm of [8] counts cycles of length $g$, $g + 2$, $g + 4$, while the proposed algorithm counts cycles of
length $g, g + 2, \ldots, 2g - 2$. The coverage of the proposed algorithm is thus at least as much as the algorithm
of [8] for graphs with $g \ge 6$. It should be noted that the Tanner graphs of almost all good LDPC codes
have $g \ge 6$.

The computational complexity of the algorithm of [8] is $O(gn^3)$, where $n = \max(|U|, |W|)$. The
complexity of the proposed algorithm is $O(g|E|^2)$. One can thus see that for sparse graphs with $|E|$
growing slower than $n^{3/2}$, the complexity of the proposed algorithm is less than that of the algorithm in [8].
Moreover, the computations in the algorithm presented here are simple integer additions and subtractions,
while in [8] the operations are mainly high-precision multiplications.

In terms of memory requirements, the algorithm of [8] requires at most $11(n^2 + m^2) + 21nm$ high
bit-width (64-bit integer) storage locations, which is of order $O(n^2)$. The proposed algorithm on the other
hand requires $2d_u|E|$ memory locations, i.e., $O(d_{\max}|E|)$, which for sparse graphs can be much smaller
than what is needed for the algorithm of [8]. Moreover, the maximum size of memory locations for the
proposed algorithm, which is proportional to the number of cycles, is usually much less than 64 bits.
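To give a feel for these growth rates, the short sketch below plugs the parameters of a (3,6)-regular graph with $n = 8000$ and $m = 4000$ into the expressions above. It is an order-of-magnitude illustration only: the function name `growth_estimates` is hypothetical, the hidden constants and the common factor $g$ are dropped, and differences in word size and operation cost are ignored, so the ratios are not predictions of measured run time or memory.

```python
def growth_estimates(n, m, d_u):
    """Leading-order operation and storage estimates for a regular bipartite graph.

    Constants and the common factor g are dropped, so only the ratios are meaningful,
    and only as orders of magnitude.
    """
    n_edges = n * d_u                              # |E| = n*d_u = m*d_w
    ops_ratio = max(n, m) ** 3 / n_edges ** 2      # O(g n^3) of [8] over O(g |E|^2)
    mem_proposed = 2 * d_u * n_edges               # 2*d_u*|E| integers
    mem_ref8 = 11 * (n**2 + m**2) + 21 * n * m     # upper bound quoted for [8]
    return n_edges, ops_ratio, mem_ref8 / mem_proposed

n_edges, ops_ratio, mem_ratio = growth_estimates(8000, 4000, 3)
print(n_edges, round(ops_ratio), round(mem_ratio))
```

Under these simplifications the operation-count ratio is in the hundreds and the storage ratio in the thousands, larger than the measured gaps reported for this graph size (more than 30 times in speed and about 600 times in memory), as expected when constants are ignored.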

VI. Numerical Results

In this section, we present numerical results obtained by applying the proposed algorithm to Tanner
graphs of LDPC codes. We consider four rate-1/2 codes from [16]. Codes A and B are listed in [16]
as PEGirReg504x1008 and PEGReg504x1008, respectively. Both codes are constructed using the
Progressive Edge Growth (PEG) method of [9], and have $n = 1008$ and $m = 504$. Code A is irregular
while Code B is regular. Codes C and D are MacKay's codes 8000.4000.3.483 and 10000.10000.3.631,
respectively. They are both regular with $d_u = 3$ and $d_w = 6$. For Code C, $n = 8000$ and $m = 4000$,
while these parameters for Code D are 20,000 and 10,000, respectively. The number of short cycles in
the Tanner graphs of these codes is listed in Table I. Codes A, C and D have girth 6 and the proposed
algorithm, similar to the algorithm of [8], can compute $N_6$, $N_8$ and $N_{10}$. Code B however has girth 8, and
while the algorithm of [8] can only compute $N_8$, $N_{10}$ and $N_{12}$, the proposed algorithm can also compute
$N_{14}$.

TABLE II
CPU Time and Memory Requirements for the Proposed Algorithm

          CPU Time (s)   Max Memory (MB)   Max Swap (MB)
Code A    5.3            0.36              3.3
Code B    3              0.36              2.8
Code C    155            13                157
Code D    1127           13                157

TABLE III
CPU Time and Memory Requirements for the Algorithm of [8]

          CPU Time (s)   Max Memory (MB)   Max Swap (MB)
Code A    10.3           1.5               35
Code B    16.6           1.5               35
Code C    4965           7839              14195
Code D    -              -                 -

Tables II and III show the running time and memory requirements of the proposed algorithm and the
algorithm of [8],² respectively. Both algorithms were run on the same machine with a 2.2-GHz CPU and
8 GB of RAM. As can be seen, the proposed algorithm is consistently faster than the algorithm of [8]
and requires significantly less memory for larger graphs. In fact, for Code D, the algorithm of [8] ran out
of memory and was not able to find the results.

As another experiment, we randomly generate six parity-check matrices for each of the following three
rate-1/2 LDPC code ensembles: $(d_u, d_w) = (3, 6), (4, 8), (5, 10)$. The lengths for each degree distribution
are: $n = 200, 500, 1000, 5000, 10{,}000$ and $20{,}000$. In the generation of the parity-check matrices, 4-cycles
are avoided. The proposed algorithm is then used to count the short cycles of each parity-check matrix.
The results, which are reported in Table IV, show that while there is a large difference between the short
cycle distribution of different degree distributions, the changes with respect to the block length for the
same degree distribution are negligible. This would imply that the complexity of the algorithms which are
based on the enumeration of short cycles in a Tanner graph is rather independent of the block length.

² To implement the algorithm of [8], we used the authors' code in [17].

TABLE IV
Distribution of short cycles in the Tanner graphs of Rate-1/2 random regular LDPC codes with
different degree distributions and different block lengths

Degree         Short Cycle    Code Lengths
Distribution   Distribution   200       500       1000      5000      10000     20000
(3,6)          N_6            171       167       181       156       166       148
               N_8            1265      1239      1226      1235      1253      1285
               N_10           10069     10110     9939      9982      9858      9974
(4,8)          N_6            1636      1611      1584      1562      1537      1572
               N_8            25005     24419     24379     24363     24529     24557
               N_10           409335    409373    408595    407958    408246    409051
(5,10)         N_6            8626      8064      8055      7978      7858      7926
               N_8            213639    212484    210767    210153    209614    210159
               N_10           6052158   6054661   6049148   6043400   6049583   6043704

VII. Conclusion 

In this paper, we proposed a distributed message-passing algorithm to count short cycles in a graph.
For bipartite graphs, the proposed algorithm counts short cycles of length $g, g + 2, \ldots, 2g - 2$, where $g$ is
the girth of the graph. For non-bipartite graphs, the algorithm counts cycles of length $g, g + 1, \ldots, 2g - 1$.
The operations performed by the algorithm are integer additions and subtractions, and the computational
and storage complexities of the algorithm are $O(g|E|^2)$ and $O(d_{\max}|E|)$, respectively, where $|E|$ and $d_{\max}$
are the number of edges and the maximum node degree in the graph, respectively. For sparse graphs, the
proposed algorithm is significantly faster and requires substantially less memory compared to the existing
algorithms, particularly for larger graphs.